Abstract. We analyze the distributed power allocation problem in parallel
multiple access channels (MAC) by studying an associated non-cooperative
game which admits an exact potential function. Even though games of this
type have been the subject of considerable study in the literature [1-4], we
find that the sufficient conditions which ensure uniqueness of Nash equilibrium
points typically do not hold in this context. Nonetheless, we show that
the parallel MAC game admits a unique equilibrium almost surely, thus establishing
an important class of counterexamples where these sufficient conditions
are not necessary. Furthermore, if the network's users employ a distributed
learning scheme based on the replicator dynamics, we show that they converge
to equilibrium from almost any initial condition, even though users only have
local information at their disposal.

1. Introduction

As a result of the massive scale at which wireless networks are deployed and
operate, non-cooperative game theory is rapidly becoming one of the main tools
with which to describe and analyze distributed resource allocation problems in this
context. The reason for this is simple: whereas solution concepts and centralized
optimization protocols which depend on global information are very hard to justify
or implement (especially in real time or in the presence of a large number of users),
game theory offers a way to look at the problem from a more distributed and
localized point of view which is often of great practical relevance.

A prime example of this can be seen in the huge corpus of literature surrounding 
power allocation games in static Gaussian multi-user networks with the objec- 
tive of reaching a Shannon-efficient state. The common characteristic of all these 
games is that the interference between multiple transmissions gives rise to non- 
trivial interactions between transmitters and imposes a bottleneck on the network 
performance: interference forces the power allocation policy of one user to depend 
on the power allocations of all other users. So, following [5], and given that the 
network users are left to optimally manage their resources on their own, the main 
questions that arise are a) whether there exist "equilibrial" allocations which are
stable against unilateral deviations; b) whether these (Nash) equilibria are unique;
and c) whether these equilibria can be reached by distributed (learning) algorithms
which require only local information.

The two most important multi-user network models that have been studied from
this perspective are the interference channel (IC) [6] and the multiple access chan- 
nel (MAC) [7], two models which are inherently different from a communications 
point of view. On the one hand, the IC is composed of several non-cooperative 
transmitter-receiver pairs and the information-theoretic capacity region is still an 
open issue for this channel model; in fact, even in the simple case of single-input, 
single-output (SISO) two-user Gaussian IC only the achievable rates are known 
[6, 8, 9]. On the other hand, the MAC is composed of several transmitters and a 
single receiver which must decode the incoming messages, and its capacity region 
is relatively well-understood [10-12], something which remains an open problem 
for the IC. 

Perhaps the most general non-cooperative power allocation games studied in 
the context of static channels are those presented by Scutari et al. in a series of 
seminal papers [1-4] focusing on the static Gaussian IC where receivers employ the 
single-user decoding (SUD) scheme which treats incoming signals from other users 
as additive noise. There, the existence of a Nash equilibrium (in the "pure" sense 
of Rosen) is a consequence of the convexity properties of the users' achievable rates 
and follows directly from Theorem 1 in [13]; in fact, under suitable (but stringent) 
conditions on the channel matrices, this equilibrium solution is unique. 

Unfortunately, there are two issues with the approach of Scutari et al.: first, 
as the authors themselves admit, these sufficient conditions "may not be easy to 
check" [14, p. 1925] and, indeed, in most cases they are not (calculating the spectral 
radius of a matrix is very hard for large matrices). Secondly, these conditions 
are not necessary, so when they fail, the uniqueness issue is left wide open. In 
the specific case of two-user parallel IC, some progress has been made in [15], 
where the authors completely characterize the set of Nash equilibria. Depending 
on the geometric properties of the best-response functions (which are identical to 
the water-filling operators of [3]), the power allocation game may have one, two, 
three or an infinite number of Nash equilibria. Finally, in [16], assuming that the 
interference links in one of the bands are negligible, the game is shown to have 
strategic complementarities and the Nash set is studied using the super-modular 
property of the game. 

A most interesting special case of these more general games consists of the paral- 
lel MAC power allocation games which are used to model uplink communication in 
multi-cellular wireless networks composed of several nodes (receivers, access points, 
base stations, etc.) that operate in orthogonal frequency bands. From a mathe- 
matical point of view, the results of [3] obviously apply to the MAC as well, but, 
as we shall see, the sufficient conditions of [4] are never met in the parallel MAC 
case, making them irrelevant to games of this type. To compensate for this, the 
authors of [17] considered two different power allocation games in parallel multiple 
access channels, depending on the users' action sets: i) the users may distribute 
their available power among the wireless nodes; or ii) the users simply choose a 
node. There, for the first game (which is more relevant for realistic power allocation
scenarios), the Nash equilibrium is argued to be unique, but the proof provided in
[17] actually holds only under very restrictive conditions (otherwise, the authors'
strict convexity arguments break down).

In this paper, we analyze non-cooperative power allocation games in parallel multiple
access channels with the standard assumption of single user decoding (SUD)
at the receiver.¹ As in the more general MIMO MAC case, the parallel MAC game
admits an exact potential (in the sense of [20]) whose extrema correspond to the 
system's sum capacity, and which can also be interpreted as the system achievable 
sum-rate if users were employing successive interference cancellation (SIC). Since 
this potential function is convex, the game's Nash equilibria will correspond to the 
minima of the potential, so the game's Nash set is necessarily convex and compact. 
However, we find that the game's potential is, in general, not strictly convex (this 
was the mistake of [17]), so one would expect that uniqueness of Nash equilibria 
fails along with the sufficient conditions of [4]. Rather surprisingly, we find that this
is not the case: even though the conditions of [4] do not hold, the Nash equilibrium 
of the game is unique (a.s.). 

As far as convergence to equilibrium is concerned, one of the main results of [21] 
is that if the transmitters know the local channel state and the overall interference- 
plus-noise covariance matrix, then the iterative sequential water-filling algorithm 
converges to the set of equilibrium points. On the other hand, asynchronous water- 
filling is harder to analyze because the sufficient conditions of [3] are typically not 
satisfied in the parallel MAC case. Finally, in a setting similar to our own (in- 
corporating pricing but restricted to only one receiver), the authors of [22] have 
considered update algorithms which converge to equilibrium modulo certain condi- 
tions which do not always hold either. 

Instead of taking a water-filling approach, we present a learning scheme based 
on the replicator dynamics of evolutionary game theory [23] which only requires 
the players to know their channel coefficients and their rates. Dynamics of this sort 
have been studied extensively in finite Nash games (that is, games with multilinear
payoff functions over a strategy space which is a product of simplices - see e.g. [24] 
for a survey) and in continuous population games [23, 25], but, in the case of finite 
nonlinear games (such as the one we have here), their properties are not as well 
understood. The first step in that direction consists of identifying the correct 
modified version of the users' payoff functions which allows the replicator dynamics 
to behave well with respect to the solution concepts of the underlying game - in 
more "traditional" finite player games, this purpose is served by the payoffs that 
correspond to the pure strategies of the game, but here we have no such structure. 
Our main contribution is to then show that in parallel MAC power allocation games, 
the replicator dynamics converge to an equilibrium point unconditionally, even in 
the zero-probability event where the game has multiple equilibria. 

Notational Conventions. Throughout this paper, we will use bold uppercase
letters to denote matrices and a dagger "†" to denote the Hermitian transpose of a
complex matrix.

If $\mathcal{S} = \{s_\alpha\}_{\alpha=1}^{n}$ is a finite set, we will denote by $K\mathcal{S}$ the disjoint union
(categorical coproduct) $K\mathcal{S} = \coprod_{k=1}^{K} \mathcal{S}$ of $K$ copies of $\mathcal{S}$. Also, recall that the (real) vector
space spanned by $\mathcal{S}$ is defined as the space $\mathbb{R}^{\mathcal{S}} = \mathrm{Hom}(\mathcal{S}, \mathbb{R})$ of functions $x : \mathcal{S} \to \mathbb{R}$,
equipped with the usual operations of addition and scalar multiplication of functions.
The canonical basis $\{e_\alpha\}_{\alpha=1}^{n}$ of $\mathbb{R}^{\mathcal{S}}$ then consists of the indicator functions
$e_\alpha : \mathcal{S} \to \mathbb{R}$ which take the value $e_\alpha(s_\alpha) = 1$ and vanish otherwise. Hence, under
the natural identification $s_\alpha \leftrightarrow e_\alpha$, we will use the index $\alpha$ to refer interchangeably
to either $s_\alpha$ or $e_\alpha$, depending on the context. Similarly, we will also identify the set
$\Delta(\mathcal{S})$ of probability measures on $\mathcal{S}$ with the standard $(n-1)$-dimensional simplex of
$\mathbb{R}^{\mathcal{S}}$: $\Delta(\mathcal{S}) = \{x \in \mathbb{R}^{\mathcal{S}} : \sum_\alpha x_\alpha = 1 \text{ and } x_\alpha \geq 0\}$.

Finally, as far as players and their strategies are concerned, we will consistently
employ Latin indices for players ($k, \ell, \dots$), while reserving Greek ones for their
("pure") strategies ($\alpha, \beta, \dots$).

¹ More efficient decoding techniques such as successive interference cancellation can also be
considered [18, 19], but optimality with respect to Shannon achievable rates will not concern
us here; instead, the low level of signalling and decoder complexity of the SUD makes it more
suitable for learning purposes. Furthermore, when using successive interference techniques, the
exact potential property of the game is lost in general.

2. The System Model 

Following [17], the basic setup of our model is as follows: we have a set
$\mathcal{K} = \{1, \dots, K\}$ of finitely many wireless (single-antenna) transmitters - the players of
the game - that wish to connect to a network of wireless nodes $\mathcal{A} = \{1, \dots, A\}$
(for instance, a collection of base stations or access points). For simplicity, we
assume that these nodes operate at distinct, non-interfering frequency bands, so
that a user $k \in \mathcal{K}$ may split his transmitting power among the nodes $\alpha \in \mathcal{A}$ subject
to the power constraint:

(1) $\sum_{\alpha} p_{k\alpha} \leq P_k$,

where $p_{k\alpha}$ is the power with which user $k$ transmits towards node $\alpha$ and $P_k$ is
the user's maximum transmitting power. As a result, the power allocation of the
$k$-th user will be represented by the point $p_k = (p_{k\alpha})_{\alpha \in \mathcal{A}} \in \mathbb{R}^{\mathcal{A}}$, while, in obvious
notation, the corresponding power profile which collectively reflects all of the users'
power allocations will be represented by $p = (p_1, \dots, p_K) \in \mathbb{R}^{K\mathcal{A}}$.

Thus, under the standard assumption of single user decoding (SUD), the spectral
efficiency of user $k$ in the power profile $p$ will be given by [17, 22]:

(2) $u_k(p) = \sum_\alpha u_{k\alpha}(p) = \sum_\alpha b_\alpha \log\left(1 + \frac{g_{k\alpha} p_{k\alpha}}{\sigma_\alpha^2 + \sum_{\ell \neq k} g_{\ell\alpha} p_{\ell\alpha}}\right)$,

where:

(1) $b_\alpha = B_\alpha / B > 0$ is a normalized version of the bandwidth $B_\alpha$ of the node
$\alpha \in \mathcal{A}$, rescaled to unity by the total bandwidth factor $B = \sum_\alpha B_\alpha$.

(2) $g_{k\alpha} > 0$ is the channel gain of user $k$ with respect to node $\alpha$, assumed
here to be static for the duration of the transmission, known to user $k$, and
drawn from a continuous (and nonatomic) probability distribution on the
positive real numbers - see also the relevant assumptions in [3, 14].

(3) $\sigma_\alpha^2 > 0$ represents the noise level associated to node $\alpha$ (typically the variance
of a Gaussian noise process).
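As a concrete illustration of (2), the following minimal sketch evaluates the spectral efficiency of each user for a given power profile. The function name, the nested-list data layout, the numerical values and the use of the natural logarithm are assumptions made for this example only.

```python
import math

def spectral_efficiency(k, p, g, b, sigma2):
    """Spectral efficiency u_k(p) of user k under single user decoding, as in (2).

    p[k][a]   : power that user k sends to node a
    g[k][a]   : channel gain of user k at node a
    b[a]      : normalized bandwidth of node a
    sigma2[a] : noise variance at node a
    """
    total = 0.0
    for a in range(len(b)):
        # Interference at node a: received power of every user other than k.
        interference = sum(g[l][a] * p[l][a] for l in range(len(p)) if l != k)
        sinr = g[k][a] * p[k][a] / (sigma2[a] + interference)
        total += b[a] * math.log(1.0 + sinr)
    return total

# Hypothetical 2-user, 2-node instance (all numbers are illustrative).
g = [[1.0, 0.5], [0.8, 1.2]]
p = [[0.6, 0.4], [0.3, 0.7]]   # each row sums to P_k = 1
b = [0.5, 0.5]
sigma2 = [1.0, 1.0]
rates = [spectral_efficiency(k, p, g, b, sigma2) for k in range(2)]
```

Each user's rate depends on the other users only through the aggregate interference at each node, which is what makes the potential structure discussed next possible.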

Remark 1. It should be noted here that when the wireless users are spatially dis- 
tributed, the set A of wireless nodes need not be common to all users. As it turns 
out, it is not too hard to extend our analysis and results to this more general case, 
but, to keep our presentation as clear as possible, we will only consider the case 
where every user can reach every node. 

Remark 2. We should also stress here that the channel gain coefficients $g_{k\alpha}$ are the
only stochastic parameters in our model, and, in our static channel setting, they
are given by $g_{k\alpha} = |h_{k\alpha}|^2$, where $h$ is a realization of the continuous random matrix
which describes the channel - see also [17]. So, unless explicitly mentioned otherwise,
any probabilistic statement we make in this paper will refer to the probability
law of the random variables $g_{k\alpha}$.

Now, as intuition would suggest (and as was shown rigorously in [17]), when the
users' utility is based solely on their spectral efficiency (2), it is clearly in the users'
best interest to transmit at the highest possible total power, i.e. satisfying (1) as
an equality.² As a result, we obtain the following components of a normal form
game $\mathfrak{G}$:

(1) The set of players of $\mathfrak{G}$ is $\mathcal{K} = \{1, \dots, K\}$.

(2) The strategy space of player $k$ is the (scaled) simplex
$\Delta_k = \{p_k \in \mathbb{R}^{\mathcal{A}} : p_{k\alpha} \geq 0 \text{ and } \sum_\alpha p_{k\alpha} = P_k\}$; as is customary, we will denote the game's
space of strategy profiles $p = (p_1, \dots, p_K)$ by $\Delta = \prod_k \Delta_k$.

(3) The players' payoffs (or utilities) are given by the spectral efficiencies
$u_k : \Delta \to \mathbb{R}$ of (2).

Of course, the game $\mathfrak{G}$ defined in this way is not finite (in the original sense of
[26]) because a) the players are not mixing over a finite set of possible actions; and
b) even though the players' strategy spaces happen to be simplices, their payoffs
are not multilinear over them. On the other hand, since $\Delta$ is a convex polytope and
the utilities $u_k$ of the users are concave functions of their power allocations $p_k$, we
immediately see that the game $\mathfrak{G}$ is concave in the sense of Rosen [13]. Moreover,
it was shown in [17] that $\mathfrak{G}$ is actually an exact potential game,³ i.e. that it admits
a (global) potential function $\Phi : \Delta \to \mathbb{R}$ such that:

(3) $u_k(p_{-k}; p'_k) - u_k(p_{-k}; p_k) = \Phi(p_{-k}; p_k) - \Phi(p_{-k}; p'_k)$,

for all players $k \in \mathcal{K}$, and for all power allocations $p_k, p'_k \in \Delta_k$ of user $k$ and
$p_{-k} \in \Delta_{-k} = \prod_{\ell \neq k} \Delta_\ell$ of $k$'s opponents $\mathcal{K}_{-k} = \mathcal{K} \setminus \{k\}$.⁴

In fact, the authors of [17] provided the following explicit form for the potential
function $\Phi$:

(4) $\Phi(p) = -\sum_\alpha b_\alpha \log\left(\sigma_\alpha^2 + \sum_k g_{k\alpha} p_{k\alpha}\right)$.

For later use, note here that $\Phi$ is itself convex, but not necessarily strictly so:⁵
indeed, any two power profiles $p, p' \in \Delta$ such that $\sum_k g_{k\alpha} p_{k\alpha} = \sum_k g_{k\alpha} p'_{k\alpha}$ for
all $\alpha \in \mathcal{A}$ will also have $\Phi(p) = \Phi(p')$. This simple observation will be of crucial
importance in determining the Nash set of the game, so we will pause here to
introduce the concept of degeneracy.
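Both properties just mentioned can be checked numerically. The sketch below, written under illustrative assumptions (hypothetical gains, bandwidths and helper names, natural logarithms), verifies the exact potential identity (3) for a unilateral deviation and then exhibits two distinct power profiles with the same aggregate received power at every node, hence the same value of the potential (4).

```python
import math

def utility(k, p, g, b, sigma2):
    """Spectral efficiency u_k(p) of (2)."""
    total = 0.0
    for a in range(len(b)):
        others = sum(g[l][a] * p[l][a] for l in range(len(p)) if l != k)
        total += b[a] * math.log(1.0 + g[k][a] * p[k][a] / (sigma2[a] + others))
    return total

def potential(p, g, b, sigma2):
    """Potential Phi(p) of (4)."""
    return -sum(
        b[a] * math.log(sigma2[a] + sum(g[k][a] * p[k][a] for k in range(len(p))))
        for a in range(len(b))
    )

# Hypothetical instance: K = 3 users, A = 2 nodes, P_k = 1 for every user.
g = [[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
b = [0.5, 0.5]
sigma2 = [1.0, 1.0]
p = [[0.5, 0.5], [0.5, 0.5], [0.5, 0.5]]

# Exact potential property (3): user 0 deviates unilaterally.
p_dev = [[0.9, 0.1], p[1], p[2]]
payoff_gain = utility(0, p_dev, g, b, sigma2) - utility(0, p, g, b, sigma2)
potential_drop = potential(p, g, b, sigma2) - potential(p_dev, g, b, sigma2)

# Degenerate direction: z = 0.1 * [(1, -1), (-2, 2), (1, -1)] keeps every
# user's total power fixed and leaves the aggregate received power at each
# node unchanged, so Phi takes the same value at p and at p + z.
p_shift = [[0.6, 0.4], [0.3, 0.7], [0.6, 0.4]]
phi_gap = potential(p_shift, g, b, sigma2) - potential(p, g, b, sigma2)
```

Up to rounding, `payoff_gain` equals `potential_drop` and `phi_gap` vanishes, which is exactly the lack of strict convexity noted above.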

To that end, let $T_p\Delta$ denote the tangent space of $\Delta$ at $p$. Since $\Delta$ is an affine
polytope embedded in $\mathbb{R}^{K\mathcal{A}}$, it is easy to see that for every interior point $p \in \mathrm{Int}(\Delta)$,
$T_p\Delta$ will be isomorphic to the subspace which is "parallel" to the polytope $\Delta$:

(5) $T_p\Delta \cong Z = \{z \in \mathbb{R}^{K\mathcal{A}} : \sum_\alpha z_{k\alpha} = 0 \text{ for all } k \in \mathcal{K}\}$.



² Of course, this need not be true if the cost of power consumption is too high [22], but we will
not deal with this issue here.

³ In the finite player sense of Monderer and Shapley [20], and not in the continuous sense of
[25].

⁴ The change of signs in (3) from [20] is deliberate. Our convention was chosen so as to conform
with physics, where it is the minima of the potential function that are stable.

⁵ This is precisely the subtle mistake that underlies the equilibrium uniqueness argument
of [17].

(a) $\mathrm{ind}(\mathfrak{G}) = 0$: generic level sets. (b) $\mathrm{ind}(\mathfrak{G}) > 0$: degeneration into affine sets.

Figure 1. The level sets (dashed blue lines) of the potential function
$\Phi$ in a $2 \times 2$ game with and without degenerate directions
(Figs. 1(b) and 1(a) respectively). Degeneracy nullifies degrees of
freedom and introduces redundant directions in the system.



However, as we just noted, some of these $KA - K$ directions will be degenerate (or
redundant), in the sense that the potential $\Phi$ remains constant as we move along
them. Specifically, the set of (almost surely independent) constraints

(6) $\sum_k g_{k\alpha} z_{k\alpha} = 0, \quad \alpha \in \mathcal{A}$,

cuts out a $(KA - A)$-dimensional subspace $W$ of $\mathbb{R}^{K\mathcal{A}}$ whose intersection with $Z$
will correspond to the total of $K + A$ constraints:

(7a) $\sum_\alpha z_{k\alpha} = 0, \quad k \in \mathcal{K}$;

(7b) $\sum_k g_{k\alpha} z_{k\alpha} = 0, \quad \alpha \in \mathcal{A}$.

Of course, the $K$ tangent space constraints (7a) are set in stone while the $A$
degeneracy constraints (7b) depend on the realization of the channel gains $g_{k\alpha}$.⁶ It
is thus possible (though improbable) that some of the constraints (7) are linearly
dependent. To keep track of all this, we have:

Definition 1. The subspace $W \leq \mathbb{R}^{K\mathcal{A}}$ defined by the constraints (7b) will be
called the space of degenerate (or redundant) directions of the game $\mathfrak{G}$. Moreover,
we define the degeneracy (or redundancy) index of $\mathfrak{G}$ to be:

(8) $\mathrm{ind}(\mathfrak{G}) = \dim(W \cap Z)$,

where $Z$ is the tangent space determined by the admissibility constraints (7a).

Example 1. As we just saw, $\mathrm{ind}(\mathfrak{G}) = KA - K - A$ (a.s.), so there is no degeneracy
in games with $K = 2$ users and $A = 2$ nodes. However, if the channel gains of
the two users happen to be linearly dependent (a zero-probability event to be sure,
but one which could be approximated reasonably well by strongly collocated users),
then we can have degenerate directions even in a $2 \times 2$ game (see Fig. 1(b)). In that
case, the potential function $\Phi$ ceases to be strictly convex, so there is no a priori
reason that the potential's minimum set will be a singleton.

⁶ These conditions are remarkably similar to the MIMO rank condition
$\mathrm{rank}(\mathbf{H}^{\dagger}\mathbf{H}) = \sum_k n_{t,k} \leq n_r + K$ (where $\mathbf{H} = [\mathbf{H}_1, \dots, \mathbf{H}_K]$ is the system's channel matrix) which ensures
that there exists a unique Nash equilibrium [27].
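The degeneracy index of Definition 1 can be computed directly as the nullity of the linear system (7). The following sketch does so with exact rational arithmetic; the function names and the integer gain values are hypothetical and serve only to reproduce the two situations of Example 1.

```python
from fractions import Fraction

def matrix_rank(rows):
    """Rank of a matrix of Fractions via exact Gaussian elimination."""
    m = [list(r) for r in rows]
    rank = 0
    n_cols = len(m[0]) if m else 0
    for col in range(n_cols):
        pivot = next((i for i in range(rank, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for i in range(rank + 1, len(m)):
            factor = m[i][col] / m[rank][col]
            m[i] = [x - factor * y for x, y in zip(m[i], m[rank])]
        rank += 1
    return rank

def degeneracy_index(g):
    """ind = dim(W ∩ Z): nullity of the K + A constraints (7a) and (7b)."""
    K, A = len(g), len(g[0])
    rows = []
    for k in range(K):            # (7a): sum over nodes of z_{k,a} = 0
        row = [Fraction(0)] * (K * A)
        for a in range(A):
            row[k * A + a] = Fraction(1)
        rows.append(row)
    for a in range(A):            # (7b): sum over users of g_{k,a} z_{k,a} = 0
        row = [Fraction(0)] * (K * A)
        for k in range(K):
            row[k * A + a] = Fraction(g[k][a])
        rows.append(row)
    return K * A - matrix_rank(rows)

generic_2x2 = degeneracy_index([[1, 2], [3, 4]])      # independent gain vectors
collocated_2x2 = degeneracy_index([[1, 2], [2, 4]])   # proportional gain vectors
three_users = degeneracy_index([[1, 1], [1, 2], [1, 3]])
```

With these inputs the generic 2 x 2 game has index 0, the game with proportional gains has index 1, and the 3-user, 2-node game has index KA - K - A = 1.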



3. Equilibrium Analysis 

In this section, our main goal will be to describe the Nash set of the game and,
more specifically, to show that it consists (almost surely) of a unique equilibrium
point which is located at a face of the strategy space $\Delta$.

This problem has attracted considerable interest in the literature, where the
papers by Scutari et al. [1-4] provide a set of sufficient conditions for uniqueness in
more general interference channel scenarios, and, more recently, in [17], where the
authors focus on the parallel MAC problem. Unfortunately, these approaches are
problematic (for different reasons): on the one hand, we will see that the sufficient
conditions of Scutari et al. [1-4] do not hold in our setting; on the other hand, the
uniqueness proof of [17] is only valid under the extremely restrictive condition that
the game is non-degenerate, i.e. that $KA \leq K + A$ (otherwise, the potential $\Phi$ is
not strictly convex).

Indeed, this last condition holds in very few scenarios, only one of
which is (barely) non-trivial: a) when we have $K = 1$ user and an arbitrary number
$A > 1$ of nodes (in which case the problem reduces to an optimization one solved by
water-filling [28]); b) when an arbitrary number $K > 1$ of users transmit with the
maximum possible power to a single node shared by all ($A = 1$; this was also the
scenario studied by [22], who also introduced a linear pricing scheme to compensate
for power costs); and c) in the $2 \times 2$ case, which is easy to solve directly. In spite of
the above, our main result in this section is that the sufficient conditions of Scutari et al.
are actually far from necessary: for (almost) any realization of the channel gain
parameters $g_{k\alpha}$, there exists a unique Nash equilibrium.

3.1. Nash Equilibrium Conditions. Since we have a finite number of players, 
the notion of Nash equilibrium takes the form of stability in the face of unilateral 
deviations. More specifically: 

Definition 2. We will say that the power profile $q \in \Delta$ is at Nash equilibrium in
the game $\mathfrak{G}$ when

(9) $u_k(q) \geq u_k(q_{-k}; q'_k)$,

for all $k \in \mathcal{K}$, and for every deviation $q'_k \in \Delta_k$ of player $k$.

In particular, if $q$ satisfies the strict version of the inequalities (9), then it will
be called a strict equilibrium of $\mathfrak{G}$.

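As is standard in convex potential games, to calculate the Nash set $\Delta^* = \Delta^*(\mathfrak{G})$
of the game, we only need to look at the (necessarily convex) minimum set of the
potential function $\Phi$. To that end, the first order constrained Karush-Kuhn-Tucker
minimization conditions [17] show that a power profile $q \in \Delta$ will be at Nash
equilibrium if and only if:

(10a) $\lambda_k - \frac{b_\alpha g_{k\alpha}}{\sigma_\alpha^2 + \sum_\ell g_{\ell\alpha} q_{\ell\alpha}} \geq 0$,

(10b) $q_{k\alpha} \left(\lambda_k - \frac{b_\alpha g_{k\alpha}}{\sigma_\alpha^2 + \sum_\ell g_{\ell\alpha} q_{\ell\alpha}}\right) = 0$,

for all players $k \in \mathcal{K}$ and all nodes $\alpha \in \mathcal{A}$, where $\lambda_k$ is the Lagrange multiplier
associated with the power constraint of user $k$ (and with the obvious constraints imposed
by the condition $q \in \Delta$).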

An obvious observation that can be gleaned from the above is that if $q$ is a Nash
equilibrium, then either a) the support $\mathrm{supp}(q_k) = \{\alpha \in \mathcal{A} : q_{k\alpha} > 0\}$ of a user's
power allocation is a singleton (i.e. the user only transmits to a single node); or
b) we will have the "waterfilling" condition:

(11) $\frac{g_{k\alpha}}{r_\alpha} = \frac{g_{k\beta}}{r_\beta}$ for all $\alpha, \beta \in \mathrm{supp}(q_k)$,

where $r_\alpha$ is the user-independent quantity given by:

(12) $r_\alpha = \frac{\sigma_\alpha^2 + \sum_\ell g_{\ell\alpha} q_{\ell\alpha}}{b_\alpha}$.

In other words, if user $k$ connects to more than one node and is at equilibrium, then
he must be "waterfilling" the quantity $g_{k\alpha}/r_\alpha$ among the nodes that he employs.
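The conditions (10) lend themselves to a simple numerical test. Since every user transmits with positive total power, (10a) and (10b) together say that each node in a user's support must attain the largest value of $b_\alpha g_{k\alpha} / (\sigma_\alpha^2 + \sum_\ell g_{\ell\alpha} q_{\ell\alpha})$ over all nodes. The sketch below checks exactly this; the function names, the tolerance and the single-user example are assumptions made for illustration.

```python
def marginal_payoffs(q, g, b, sigma2):
    """v[k][a] = b_a g_ka / (sigma2_a + sum_l g_la q_la), the ratio in (10)."""
    K, A = len(q), len(b)
    load = [sigma2[a] + sum(g[l][a] * q[l][a] for l in range(K)) for a in range(A)]
    return [[b[a] * g[k][a] / load[a] for a in range(A)] for k in range(K)]

def satisfies_kkt(q, g, b, sigma2, tol=1e-9):
    """Check (10a)-(10b) for a profile q whose rows already satisfy the
    power constraints (feasibility of q itself is not re-checked here)."""
    v = marginal_payoffs(q, g, b, sigma2)
    for k in range(len(q)):
        lam = max(v[k])  # smallest multiplier compatible with (10a)
        for a in range(len(b)):
            if q[k][a] > tol and lam - v[k][a] > tol:
                return False  # (10b) fails: power on a node that is not optimal
    return True

# Hypothetical single-user example with two identical nodes and P_1 = 2.
g = [[1.0, 1.0]]
b = [0.5, 0.5]
sigma2 = [1.0, 1.0]
balanced = satisfies_kkt([[1.0, 1.0]], g, b, sigma2)   # equal split
lopsided = satisfies_kkt([[2.0, 0.0]], g, b, sigma2)   # all power on node 1
```

In this symmetric example the equal split passes the test, whereas concentrating all power on one node does not, because the idle node offers a strictly larger marginal payoff.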

A promising way to determine whether our game admits a unique equilibrium is
to take advantage of the plethora of sufficient conditions that have been established
in the literature for this purpose. In our setting, the condition which is easiest to
check is the one proposed by Scutari et al. in [4, Equation (21)], which takes
the form:

(Cmax) $\rho(\mathbf{S}^{\max}) < 1$,

where $\rho(\mathbf{S}^{\max}) = \max\{|\lambda| : \lambda \in \mathrm{eig}(\mathbf{S}^{\max})\}$ is the spectral radius (i.e. the largest
modulus among the eigenvalues) of the $K \times K$ matrix $\mathbf{S}^{\max} = \{S^{\max}_{k\ell}\}$ defined as:

(13) $S^{\max}_{k\ell} = \begin{cases} 0, & k = \ell, \\ \max_\alpha \{g_{\ell\alpha}/g_{k\alpha}\}, & k \neq \ell. \end{cases}$

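However, since $\max_\alpha\{g_{\ell\alpha}/g_{k\alpha}\} = (\min_\alpha\{g_{k\alpha}/g_{\ell\alpha}\})^{-1} \geq (S^{\max}_{\ell k})^{-1}$, we immediately
see that the entries of $\mathbf{S}^{\max}$ satisfy the inequality $S^{\max}_{k\ell} S^{\max}_{\ell k} \geq 1$ for any
distinct pair of users $k, \ell \in \mathcal{K}$. Hence, $\mathrm{tr}((\mathbf{S}^{\max})^2)$ will be bounded from below by:

(14) $\mathrm{tr}((\mathbf{S}^{\max})^2) = \sum_{k \neq \ell} S^{\max}_{k\ell} S^{\max}_{\ell k} \geq K(K-1)$,

and, by the spectral radius bounds of [29], we will have:

(15) $\rho(\mathbf{S}^{\max}) \geq \frac{\mathrm{tr}(\mathbf{S}^{\max})}{S} + \sqrt{\frac{\mathrm{tr}((\mathbf{S}^{\max})^2) - \mathrm{tr}(\mathbf{S}^{\max})^2 / S}{S(S-1)}}$,

where $S = \mathrm{rank}(\mathbf{S}^{\max})$.⁷ However, since $\mathrm{tr}(\mathbf{S}^{\max}) = 0$ by definition, (15) gives
$\rho(\mathbf{S}^{\max}) \geq 1$, so the sufficient condition (Cmax) fails.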



⁷ Strictly speaking, (15) holds if $S \geq 2$, but we can trivially disregard the case $S < 2$, because
$S = K$ almost surely and the one-user case holds little interest.
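The failure of (Cmax) is easy to observe numerically. The sketch below builds the matrix of (13) for a small hypothetical gain matrix, checks the pairwise inequality that leads to (14), and computes a lower bound on the spectral radius. The bound used here is the Collatz-Wielandt quotient obtained by power iteration on the identity plus the matrix, a standard device for nonnegative matrices that is assumed for this example and is not the bound of [29]; all names and numbers are illustrative.

```python
def s_max(g):
    """Matrix of (13): zero diagonal, max over nodes of g_la / g_ka elsewhere."""
    K, A = len(g), len(g[0])
    return [[0.0 if k == l else max(g[l][a] / g[k][a] for a in range(A))
             for l in range(K)] for k in range(K)]

def spectral_radius_lower_bound(S, iterations=500):
    """Collatz-Wielandt lower bound on rho(S) for a nonnegative matrix S.

    Power iteration is run on I + S, whose spectral radius is 1 + rho(S)
    when S is nonnegative. For any positive vector x, the quantity
    min_i ((I + S) x)_i / x_i is a lower bound on 1 + rho(S).
    """
    n = len(S)
    x = [1.0] * n
    bound = 0.0
    for _ in range(iterations):
        y = [x[i] + sum(S[i][j] * x[j] for j in range(n)) for i in range(n)]
        bound = min(y[i] / x[i] for i in range(n)) - 1.0
        top = max(y)
        x = [yi / top for yi in y]   # renormalize to avoid overflow
    return bound

# Hypothetical gains for K = 3 users and A = 2 nodes.
g = [[1.0, 2.0], [2.0, 1.0], [1.5, 1.5]]
S = s_max(g)
pair_products = [S[k][l] * S[l][k] for k in range(3) for l in range(3) if k != l]
trace_of_square = sum(pair_products)          # left-hand side of (14)
rho_lower = spectral_radius_lower_bound(S)
```

For these gains every pairwise product is at least 1, the trace of the squared matrix is at least K(K - 1) = 6, and the lower bound on the spectral radius exceeds 1, so (Cmax) cannot hold.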
Other sufficient conditions were put forth in [1-3] and [30], based on the matrices
$\mathbf{S}(\alpha) = \{S_{k\ell}(\alpha)\}$ defined as:

(16) $S_{k\ell}(\alpha) = \begin{cases} 0, & k = \ell, \\ g_{\ell\alpha}/g_{k\alpha}, & k \neq \ell. \end{cases}$

Specifically, if

(C1) $\rho(\mathbf{S}(\alpha)) < 1$ for all $\alpha \in \mathcal{A}$,

then the game $\mathfrak{G}$ admits a unique Nash equilibrium; in a similar vein, the authors
of [30] proposed the condition:

(C2) $\mathbf{I} + \mathbf{S}(\alpha) \succ 0$ for all $\alpha \in \mathcal{A}$,

where "$\succ 0$" signifies positive-definiteness.

Of these two conditions, (C1) is less restrictive than (Cmax), in the sense that (Cmax)
is sufficient for (C1). However, the same analysis as before shows that in the case
of the $\mathbf{S}(\alpha)$ matrices, (14) holds as an equality, so we still get $\rho(\mathbf{S}(\alpha)) \geq 1$ for
all $\alpha \in \mathcal{A}$, causing (C1) to fail. Similarly, even though the positive-definiteness
condition (C2) is independent of (C1) and (Cmax), the definition of $\mathbf{S}(\alpha)$ yields
$S_{k\ell}(\alpha) + S_{\ell k}(\alpha) \geq 2$ for all $k \neq \ell$. Consequently, the element with the largest
modulus of the symmetrized matrix $\mathbf{I} + \frac{1}{2}(\mathbf{S}(\alpha) + \mathbf{S}^{T}(\alpha))$ does not lie on the main
diagonal, so the matrix $\mathbf{I} + \mathbf{S}(\alpha)$ cannot be positive-definite either.

Remark. Strictly speaking, condition (C1) was phrased in [1] in terms of a slightly
different version of the matrix $\mathbf{S}(\alpha)$ where $S_{k\ell}(\alpha) = 0$ whenever the channel of $\alpha$
is "too bad" for either $k$ or $\ell$ (in a sense made precise in [1]). In this more general
setup, if $\alpha$ is "bad" for user $k$, then the $k$-th row and $k$-th column of $\mathbf{S}(\alpha)$ vanish,
so the bound (14) is decreased to $(K - r)(K - r - 1)$, where $r$ is the number of
zero rows and columns that were introduced in $\mathbf{S}(\alpha)$. However, this also reduces
the rank of $\mathbf{S}(\alpha)$ accordingly, so, assuming that $\mathrm{rank}(\mathbf{S}(\alpha)) \geq 2$, the bound (15)
still gives $\rho(\mathbf{S}(\alpha)) \geq 1$.

Of course, this still leaves open a small window where the condition (C1) might
be salvaged - namely the rare occurrence where the $\mathbf{S}(\alpha)$ matrices all have rank 1
or less. However, instead of focusing on this very special case, we note that even
the extensive numerical simulations of [1] show that the sufficient condition (C1)
almost never holds in the parallel MAC setting. Indeed, if we follow [1] and assume
for simplicity that the transmitter-receiver distances are all equal ($d_{qr} = d_{rq}$ in
their notation), then the "normalized interlink distance" becomes equal to 1 and
Figure 1 of [1] reveals that (C1) fails almost surely.

We thus see that, despite their theoretical value, the sufficient conditions that
have been established in the literature are quite problematic in the parallel MAC
setting because they are typically not met (except possibly in some very special
cases). Therefore, in order to address the uniqueness issue in complete generality,
we will need to develop a different set of tools.

3.2. Representing Power Profiles as Graphs. As we shall see, the "waterfilling"
conditions (11) impose some pretty severe constraints on the structure of the
equilibrium set $\Delta^*$, because whenever a user waterfills between nodes, his channel
gains must "split", i.e. be of the form $g_{k\alpha} = \lambda_k r_\alpha$. This is actually best understood
pictorially, by representing a power profile $p \in \Delta$ as a graph:

Definition 3. We will say that the (multi)graph $\mathcal{G} = (\mathcal{V}, \mathcal{E})$ represents the power
profile $p = (p_1, \dots, p_K) \in \Delta$ if:

(1) $\mathcal{V} = \mathcal{A}$: the nodes of $\mathcal{G}$ coincide with the network's;

(2) for each $k \in \mathcal{K}$, there is a node $\alpha \in \mathcal{A}$ (called the hub of user $k$ in $\mathcal{G}$) to
which the user assigns positive power $p_{k\alpha} > 0$, and which is joined by an
edge of $\mathcal{G}$ to every other node $\beta \in \mathrm{supp}(p_k)$.

In simpler words, to represent a power profile $p \in \Delta$ as a graph, one merely has
to take the set of wireless nodes as the set of the graph's nodes, and then, for every
user $k \in \mathcal{K}$, to pick a node which the user employs and connect it with an edge to
every other node to which the user assigns positive power. Of course, depending
on the choice of "hub" for each user $k \in \mathcal{K}$, one might end up with non-isomorphic
graphs representing the same power profile $p$. However, this lack of uniqueness will
not be important to us, so we will occasionally abuse Definition 3 by using $\mathcal{G}(p)$ to
collectively denote any graph which represents the profile $p \in \Delta$.

In light of the above, we now state a few key lemmas and corollaries that will
be crucial in our efforts to understand the structure of the equilibrial set $\Delta^*$. The
first one is an elegant structural property of equilibrial graphs:

Lemma 4. Let $\mathcal{G} = \mathcal{G}(p)$ represent a power profile $p \in \Delta^*$ which is at Nash equilibrium.
Then $\mathcal{G}$ is almost surely a forest - that is, $\mathcal{G}$ contains no cycles.

Proof. The intuitive idea behind this lemma is that if there is a cycle, then we can
get a chain of fractions $g_{k_1,\alpha_1}/g_{k_1,\alpha_0}$, $g_{k_2,\alpha_2}/g_{k_2,\alpha_1}, \dots$, which will have a product
equal to 1 because of the waterfilling condition (11). However, this represents a
condition on the $g$'s which occurs with zero probability, thus providing a contradiction.

To make this idea precise, assume that $\mathcal{G}$ contains a cycle $\Gamma$ denoted as a sequence
of edges $\Gamma = (e_1, \dots, e_n)$.⁸ Since an edge can only be owned by a single player, this
cycle gives rise to a sequence of players which we also denote by $\Gamma = (k_1, \dots, k_n)$.

So, if $(\alpha_0, \alpha_1, \dots, \alpha_n)$ is the corresponding sequence of nodes that $\Gamma$ passes
through (obviously, $\alpha_0 = \alpha_n$), then (11) gives:

(17) $\frac{g_{k_j,\alpha_j}}{g_{k_j,\alpha_{j-1}}} = \frac{r_{\alpha_j}}{r_{\alpha_{j-1}}}$, for all $j = 1, \dots, n$.

Therefore, multiplying these $n$ equations together, we get:

(18) $\frac{g_{k_1,\alpha_1}}{g_{k_1,\alpha_0}} \cdots \frac{g_{k_n,\alpha_n}}{g_{k_n,\alpha_{n-1}}} = \frac{r_{\alpha_1}}{r_{\alpha_0}} \cdots \frac{r_{\alpha_n}}{r_{\alpha_{n-1}}} = 1$.

Since there are no cancellations in this last equation (recall that the nodes $\alpha_j$,
$j = 1, \dots, n-1$ of $\Gamma$ are all distinct), it will describe a measure zero submanifold
of the space from which the channel coefficients $g$ are drawn. As a result, (18) only
holds with probability zero and, hence, the assumption that $\mathcal{G}$ contains a cycle is
almost surely false. □
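The graph of Definition 3 and the forest property of Lemma 4 are straightforward to test for any given power profile. The sketch below builds one representing graph (taking, as an arbitrary choice, the first node in each user's support as the hub) and checks for cycles with a union-find structure. The function names, node numbering and power values are hypothetical.

```python
def representing_edges(p, tol=0.0):
    """Edges (user, hub, node) of one graph representing p as in Definition 3.

    The hub of each user is taken to be the first node of its support; this
    choice is arbitrary. Every user is assumed to transmit with positive
    total power, so that the support is nonempty.
    """
    edges = []
    for k, row in enumerate(p):
        support = [a for a, power in enumerate(row) if power > tol]
        hub = support[0]
        edges.extend((k, hub, a) for a in support[1:])
    return edges

def is_forest(num_nodes, edges):
    """True if the multigraph has no cycle (parallel edges count as a cycle)."""
    parent = list(range(num_nodes))

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]   # path halving
            a = parent[a]
        return a

    for _, a, c in edges:
        root_a, root_c = find(a), find(c)
        if root_a == root_c:
            return False                    # this edge closes a cycle
        parent[root_a] = root_c
    return True

# Four nodes (0, 1, 2, 3) and three users; the powers are illustrative.
overlapping = [[0.4, 0.3, 0.0, 0.3],   # user uses nodes 0, 1, 3
               [0.4, 0.3, 0.3, 0.0],   # user uses nodes 0, 1, 2
               [0.0, 0.4, 0.3, 0.3]]   # user uses nodes 1, 2, 3
sparse = [[0.5, 0.5, 0.0, 0.0],
          [0.0, 0.5, 0.5, 0.0],
          [0.0, 0.0, 0.0, 1.0]]
overlapping_is_forest = is_forest(4, representing_edges(overlapping))
sparse_is_forest = is_forest(4, representing_edges(sparse))
```

The first profile yields six edges on four nodes and therefore contains cycles, so by Lemma 4 it is almost surely not an equilibrium; the second profile yields a forest with two edges.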

As an easy corollary of the above lemma, we also get: 

Corollary 5. If $p \in \Delta^*$ is an equilibrial power profile, then there are (a.s.) at most
$A - 1$ instances of waterfilling (i.e. two nodes employed by the same player).



⁸ Note that keeping track only of the nodes is not enough because two distinct edges might link
the same pair of nodes.

Figure 2. A graph representing a power profile in a game with
3 users (red, blue and green) and 4 nodes ($\alpha$, $\beta$, $\gamma$ and $\delta$). In the
profile represented above, the red player uses $\alpha$, $\beta$ and $\delta$, blue uses
$\alpha$, $\beta$ and $\gamma$, and green employs $\beta$, $\gamma$ and $\delta$.



Proof. Simply note that a forest on A nodes can have at most A — 1 edges; our 
result then follows by recalling that an edge in this context simply represents an 
instance of waterfilling. □ 

From a geometrical point of view, this shows that Nash equilibria can only live 
on the faces of the strategy space A: 

Corollary 6. Let p £ A* be an equilibrial power profile. Then p lies in the interior 
of an at most (A — 1)- dimensional face of A (a.s.). 

Proof. Since a user who waterfills between m nodes transmits with zero power 
towards the rest of the nodes, his power allocation pk will belong to the interior 
of an (m — l)-dimensional face of A&. The result then follows by combining this 
observation with Corollary 5. □ 

We thus see that the Nash set A* has to be contained in the interior of a face of 
A of dimension at most A — 1. We will now show that A* is actually a singleton: 

Theorem 7. The game (5 has a unique Nash equilibrium (a.s.). 

Proof. The basic idea of the proof is a geometric one. Assume that there are two
distinct equilibrium points, $p$ and $p'$, in the interior of some $d$-dimensional
face $\rho$ of $\Delta$. By convexity, the line segment spanned by $p$ and $p'$ will also
belong to the Nash set $\Delta^*$, which implies that this line segment must (a.s.)
lie in the subspace $W$ of degenerate directions of the game.

In other words, if the Nash equilibrium is not unique, then the subspace $W$ of
degenerate directions intersects nontrivially with a $d$-dimensional face $\rho$ whose
interior contains an equilibrium. However, since $\dim(W) = KA - A$ (a.s.),
Corollary 6 gives $\dim(\rho) + \dim(W) \le KA - A + A - 1 < KA$. On the other
hand, it is well known that two generic subspaces of a real vector space $V$ intersect
nontrivially if and only if the sum of their dimensions exceeds $\dim(V)$, so, since
$\Delta$ is embedded in $\mathbb{R}^{KA}$, we may conclude that $\rho$ and $W$ intersect trivially with
probability 1, a contradiction. □
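
As a quick sanity check of the dimension count in the proof above, one may plug in
the setting of Figure 2, i.e. $K = 3$ users and $A = 4$ nodes (these numbers serve
purely as an illustration):

```latex
\begin{aligned}
\dim(\mathbb{R}^{KA}) &= 3 \cdot 4 = 12, \\
\dim(W) &= KA - A = 12 - 4 = 8, \\
\dim(\text{face}) &\le A - 1 = 3, \\
\dim(\text{face}) + \dim(W) &\le 3 + 8 = 11 < 12.
\end{aligned}
```

The sum of the two dimensions thus falls short of 12, the value that it would have
to exceed for a generic nontrivial intersection.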
4. Convergence to Equilibrium

Having determined the properties of the game's unique equilibrium point, our
task in this section will be to present a decentralized learning scheme which
allows users to converge to this equilibrium point (and to estimate the speed of
this convergence).

Given that the structure of the game $\mathfrak{G}$ does not adhere to the (multilinear)
setting of Nash [26],⁹ the usual theory of evolutionary "random-matching" games
does not apply either. This leaves us in a rather unclear position on how to proceed,
but since players invariably want to increase their rewards and an increase in payoff
is equivalent to a decrease in potential, we will begin by considering the directional
derivatives of the potential function $\Phi$:

(19)  $v_{k\alpha}(p) \equiv -\frac{\partial \Phi}{\partial p_{k\alpha}} = \frac{b_\alpha g_{k\alpha}}{\sigma_\alpha^2 + \sum_{\ell} g_{\ell\alpha} p_{\ell\alpha}}$.

Clearly, if a player transmits with positive power to node $\alpha$, then he will be able
to calculate the gradient $v_{k\alpha}(p)$ in terms of the observables $p_{k\alpha}$ (the user's power
allocation), $g_{k\alpha}$ (his channel gain coefficients), and the spectral efficiency
$u_{k\alpha}(p) = \log\big(1 + g_{k\alpha} p_{k\alpha} / (\sigma_\alpha^2 + \sum_{\ell \neq k} g_{\ell\alpha} p_{\ell\alpha})\big)$ of (2)
which user $k$ observes at node $\alpha$.¹⁰ As a result, any learning scheme which relies
only on the $v_{k\alpha}$'s will be inherently distributed in the sense that it only requires
information that is readily obtainable by the individual players.
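
To make this locality claim concrete, the following minimal Python sketch (an
illustration, not the authors' implementation) computes the marginal payoffs of
(19) in two ways: once from the full power profile, and once from a single user's
own observables at one node. It assumes that the spectral efficiency is
$\log(1 + \mathrm{SINR})$ without a bandwidth factor, as in the expression quoted above,
and that the bandwidth $b_\alpha$ is known to the user; if (2) carries the factor
$b_\alpha$, `u_ka` should be replaced by `u_ka / b_a`. All function names and
numerical values are hypothetical.

```python
import math


def marginal_payoffs(p, g, b, sigma2):
    """v[k][a] = b[a] * g[k][a] / (sigma2[a] + sum_l g[l][a] * p[l][a]), cf. (19)."""
    K, A = len(p), len(b)
    load = [sigma2[a] + sum(g[l][a] * p[l][a] for l in range(K)) for a in range(A)]
    return [[b[a] * g[k][a] / load[a] for a in range(A)] for k in range(K)]


def spectral_efficiency(k, a, p, g, sigma2):
    """Assumed form log(1 + SINR) of what user k observes at node a."""
    interference = sigma2[a] + sum(
        g[l][a] * p[l][a] for l in range(len(p)) if l != k
    )
    return math.log(1.0 + g[k][a] * p[k][a] / interference)


def local_marginal_payoff(p_ka, u_ka, b_a):
    """Marginal payoff recovered from local observables only (needs p_ka > 0).

    Since exp(-u_ka) = interference / (interference + g_ka * p_ka), we get
    b_a * (1 - exp(-u_ka)) / p_ka = b_a * g_ka / (interference + g_ka * p_ka).
    """
    return b_a * (1.0 - math.exp(-u_ka)) / p_ka


# Hypothetical 2-user, 2-node example.
p = [[0.7, 0.3], [0.4, 0.6]]
g = [[1.0, 0.5], [0.8, 1.2]]
b = [1.0, 2.0]
sigma2 = [0.1, 0.2]
v = marginal_payoffs(p, g, b, sigma2)
```

Under these assumptions the two computations agree whenever $p_{k\alpha} > 0$, which
is why a positive transmit power is needed for the local estimate.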

With all this in mind, a particularly simple scheme to follow is that of the
replicator dynamics [23] associated with the "marginal payoffs" $v_{k\alpha}$. More
specifically, this means that the players update their power allocations according
to the differential equation:

(20)  $\frac{dp_{k\alpha}}{dt} = p_{k\alpha} \big( v_{k\alpha}(p(t)) - v_k(p(t)) \big)$,

where $v_k$ is just the user average $v_k(p) = P_k^{-1} \sum_\beta p_{k\beta} v_{k\beta}(p)$.

As usual, the rest points of (20) are characterized by the (waterfilling) property
that, for every pair of nodes $\alpha, \beta \in \mathrm{supp}(p_k)$ to which user $k$ allocates positive
power, we will also have $v_{k\alpha}(p) = v_{k\beta}(p)$. Hence, comparing this to the KKT
conditions (10), we immediately see that the Nash equilibria of $\mathfrak{G}$ are stationary in
the replicator equation (20). This result is well known in finite Nash games with
multilinear payoffs [24] and in continuous population games [25], but the converse
does not hold: for instance, every vertex of $\Delta$ is stationary in (20), so stationarity
in (20) does not imply equilibrium.
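
The replicator equation (20) can be explored numerically with a simple explicit
Euler discretization. The sketch below is only an illustration under stated
assumptions: the step size `dt`, the game parameters and the function names are
hypothetical, and the Euler scheme is our choice of integrator rather than
something prescribed by the text. It shows that each user's total power is
preserved along the iteration and that a vertex of the strategy space is left
unchanged.

```python
def marginal_payoffs(p, g, b, sigma2):
    """v[k][a] = b[a] * g[k][a] / (sigma2[a] + sum_l g[l][a] * p[l][a]), cf. (19)."""
    K, A = len(p), len(b)
    load = [sigma2[a] + sum(g[l][a] * p[l][a] for l in range(K)) for a in range(A)]
    return [[b[a] * g[k][a] / load[a] for a in range(A)] for k in range(K)]


def replicator_step(p, g, b, sigma2, P, dt):
    """One explicit Euler step of the replicator equation (20)."""
    v = marginal_payoffs(p, g, b, sigma2)
    new_p = []
    for k, p_k in enumerate(p):
        # User average v_k(p) = (1 / P_k) * sum_b p_kb * v_kb(p).
        v_avg = sum(p_ka * v_ka for p_ka, v_ka in zip(p_k, v[k])) / P[k]
        new_p.append(
            [p_ka + dt * p_ka * (v_ka - v_avg) for p_ka, v_ka in zip(p_k, v[k])]
        )
    return new_p


# Hypothetical 2-user, 2-node game.
g = [[1.0, 0.5], [0.8, 1.2]]
b = [1.0, 1.0]
sigma2 = [0.1, 0.2]
P = [1.0, 1.0]

# An interior starting point, iterated for a while.
interior = [[0.5, 0.5], [0.5, 0.5]]
for _ in range(1000):
    interior = replicator_step(interior, g, b, sigma2, P, dt=0.01)

# A vertex of the strategy space: each user puts all power on one node.
vertex = [[1.0, 0.0], [0.0, 1.0]]
after_vertex = replicator_step(vertex, g, b, sigma2, P, dt=0.01)
```

The step size is kept small enough that `dt * (v_ka - v_avg)` stays well below 1 in
magnitude for these parameters, so the iterates remain non-negative.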

Nevertheless, only Nash equilibria can be attracting, and, in fact, they attract 
almost every replicator solution orbit: 

Theorem 8. Let $q \in \Delta$ be the unique (a.s.) equilibrium of $\mathfrak{G}$. Then, every solution
orbit of the replicator dynamics (20) which begins at finite Kullback-Leibler entropy
from $q$ will converge to it. Furthermore, even if the game does not admit a unique
equilibrium, every interior trajectory still converges to a Nash equilibrium (and not
merely to the Nash set of the game).

⁹ Or even the continuous population models of [25]: there are no "node-specific"
rewards in our problem like the "phenotype-specific" growth rates of evolutionary
biology.

¹⁰ Note that this is different from gradient techniques applied to the utility functions
themselves, a practice which requires the utility functions to be known.

Figure 3. Convergence to equilibrium in $2 \times 2$ power allocation games (the dashed
grey contours represent the level sets of the game's potential): (a) global
convergence to equilibrium; (b) convergence in degenerate games. If the game
admits a unique equilibrium (which is almost always the case), then this
equilibrium is (almost) globally attracting (Fig. 3(a)). However, even when the
game has more than one equilibrium (Fig. 3(b)), every interior replicator
trajectory converges to an equilibrium point.

Remark. Recall that the Kullback-Leibler divergence (or relative entropy) of $p$ with
respect to $q$ is [23]:

(21)  $H_q(p) = \sum_k H_{q_k}(p_k) = \sum_{k,\alpha} q_{k\alpha} \log (q_{k\alpha} / p_{k\alpha})$.

Clearly, $H_q(p)$ is finite if and only if $p_k$ allocates positive power $p_{k\alpha} > 0$ to all
nodes $\alpha \in \mathrm{supp}(q_k)$, i.e. to all nodes which are present in $q_k$; more succinctly, the
domain of $H_q$ consists of all power allocations $p$ with respect to which $q$ is
absolutely continuous.
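
The finiteness condition of the remark above is easy to see in code. The following
Python sketch of (21) is a minimal illustration (the function name and the sample
profiles are hypothetical); it uses the usual convention $0 \log 0 = 0$ and returns
infinity as soon as $p$ misses a node that is present in $q$.

```python
import math


def kl_divergence(q, p):
    """Relative entropy H_q(p) = sum_{k,a} q[k][a] * log(q[k][a] / p[k][a]), cf. (21)."""
    total = 0.0
    for q_k, p_k in zip(q, p):
        for q_ka, p_ka in zip(q_k, p_k):
            if q_ka == 0.0:
                continue  # convention: 0 * log(0 / x) = 0
            if p_ka == 0.0:
                return math.inf  # p does not support a node present in q
            total += q_ka * math.log(q_ka / p_ka)
    return total


# Hypothetical single-user profiles over two nodes (total power 1).
q_boundary = [[1.0, 0.0]]   # all power on the first node
p_interior = [[0.5, 0.5]]   # supports every node
p_missing = [[0.0, 1.0]]    # misses the node used by q_boundary

h_finite = kl_divergence(q_boundary, p_interior)
h_infinite = kl_divergence(q_boundary, p_missing)
h_zero = kl_divergence(p_interior, p_interior)
```

In particular, every interior allocation lies in the domain of $H_q$, whatever the
support of $q$.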

This convergence result (proved in Appendix A) is extremely powerful because
it shows that the network's users will eventually settle down to a stable state which
discourages unilateral deviations, even though they only have local information
at their disposal. The only case that is left open in the above theorem is what
happens if the initial K-L entropy of the solution orbit is infinite, i.e. if the users'
initial power allocation does not support all of the nodes which are present in
equilibrium. If this is the case, then the face-invariance property of the replicator
dynamics ($p_{k\alpha}(t) = 0$ iff $p_{k\alpha}(0) = 0$) will prevent the users from settling down to
a Nash equilibrium. However, an easy analysis shows that if one takes the reduced
game where each user only has access to the nodes to which he initially allocates
positive power, then the users will actually converge to an equilibrium of this
reduced game:

Proposition 9. Let $p(0) = (p_1(0), \dots, p_K(0))$ be an initial power allocation profile
in the game $\mathfrak{G}$ and let $\mathcal{A}_k = \mathrm{supp}(p_k(0)) \subseteq \mathcal{A}$. Then, if $\mathfrak{G}^0$ is a reduced version of
$\mathfrak{G}$ which is played over $\Delta^0 = \prod_k P_k \Delta(\mathcal{A}_k)$ with payoffs

(22)  $u_k^0(p) = \sum_\alpha b_\alpha \log \left( 1 + \frac{g_{k\alpha} p_{k\alpha}}{\sigma_\alpha^2 + \sum_{\ell \neq k} g_{\ell\alpha} p_{\ell\alpha}} \right)$,

the replicator dynamics (20) converge to the unique (a.s.) equilibrium of the reduced
game $\mathfrak{G}^0$.
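
The face-invariance property behind Proposition 9 can be observed in a small
simulation. The sketch below is a hedged illustration only: it uses an explicit
Euler discretization of (20) (our choice of integrator), and the game parameters,
the step size and the function names are all hypothetical. User 0 starts without
any power on the last node, and the iteration never moves him there.

```python
def replicator_step(p, g, b, sigma2, P, dt):
    """One explicit Euler step of (20) with the marginal payoffs of (19)."""
    K, A = len(p), len(b)
    load = [sigma2[a] + sum(g[l][a] * p[l][a] for l in range(K)) for a in range(A)]
    new_p = []
    for k in range(K):
        v_k = [b[a] * g[k][a] / load[a] for a in range(A)]
        v_avg = sum(p[k][a] * v_k[a] for a in range(A)) / P[k]
        new_p.append([p[k][a] + dt * p[k][a] * (v_k[a] - v_avg) for a in range(A)])
    return new_p


def supports(p):
    """Node sets A_k = supp(p_k) for every user k."""
    return [{a for a, x in enumerate(p_k) if x > 0.0} for p_k in p]


# Hypothetical game with 2 users and 3 nodes; user 0 initially ignores node 2,
# even though his channel gain there is the largest.
g = [[1.0, 0.6, 2.0], [0.9, 1.1, 0.7]]
b = [1.0, 1.0, 1.0]
sigma2 = [0.1, 0.1, 0.1]
P = [1.0, 1.0]

p = [[0.5, 0.5, 0.0], [0.3, 0.3, 0.4]]
initial_supports = supports(p)
for _ in range(2000):
    p = replicator_step(p, g, b, sigma2, P, dt=0.01)
final_supports = supports(p)
```

Since every update multiplies $p_{k\alpha}$ by a factor, a coordinate that starts at
zero stays at zero, so the iteration effectively takes place in the reduced game.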



5. Conclusions 

In this paper, we studied distributed power allocation in parallel multiple access 
channels, modeling e.g. the problem of uplink communication in networks that 
consist of wireless receivers that operate in orthogonal frequency bands. Despite 
the fact that these games are special cases of the more general framework of [1-4], 
the sufficient conditions provided therein for uniqueness of equilibrium typically fail 
in our case. Nonetheless, we show that the game does admit a unique equilibrium 
by studying the properties of the game's exact potential function (and correcting a 
mistake of [17] in the process). Furthermore, by introducing a distributed learning 
scheme based on the replicator dynamics of evolutionary game theory, we show 
that users converge to the game's unique (a.s.) equilibrium. This result carries 
significant applicational potential because it ensures convergence to equilibrium 
even in decentralized settings where the users only have access to local information
(in our case, the channel coefficients $g_{k\alpha}$ and the spectral efficiencies $u_{k\alpha}$).

Since the parallel MAC game is a special case of the more general IC one, a 
natural question that arises is whether our analysis extends to this more general 
case as well. One immediate observation is that the convergence properties of the 
replicator dynamics are still valid in general convex potential games played over 
products of simplices, but since the MIMO MAC game is actually played over the 
polytope of non-negative definite channel matrices with constrained trace, it is not 
as easy to write a continuous-time equation there. Further directions to be explored 
include the speed of convergence of the replicator dynamics to equilibrium (which 
can be shown to be exponentially fast) and the robustness of the replicator dynamics
under stochastic disturbances which reflect inaccuracies in the users' observations
(e.g. of the channel coefficients $g_{k\alpha}$). However, a discussion of these issues would
take us too far afield (and well beyond the space limitations of this paper), so we
prefer to postpone them for the future.



Appendix A. Proof of Convergence 

This appendix is devoted to the proof of Theorem 8. The basic idea will be
to show that the replicator dynamics (20) admit a Lyapunov function, i.e. a
non-negative function $f \colon \Delta \to \mathbb{R}$ with $f(p) = 0$ if and only if $p = q$ and such that
$\dot f(p) = \sum_{k,\alpha} \frac{\partial f}{\partial p_{k\alpha}} \dot p_{k\alpha} < 0$ for all $p \neq q$.

A particularly appealing candidate is the game's own potential function $\Phi$. Indeed,
an easy differentiation of (4) yields $\partial \Phi / \partial p_{k\alpha} = -\partial u_k / \partial p_{k\alpha}$, so we obtain:

(23)  $\begin{aligned} \frac{d\Phi}{dt} &= \sum_{k,\alpha} \frac{\partial \Phi}{\partial p_{k\alpha}} \frac{dp_{k\alpha}}{dt} = -\sum_{k,\alpha} v_{k\alpha}(p(t)) \, p_{k\alpha}(t) \big( v_{k\alpha}(p(t)) - v_k(p(t)) \big) \\ &= -\sum_k P_k \Big[ \sum_\alpha \frac{p_{k\alpha}(t)}{P_k} v_{k\alpha}^2(p(t)) - v_k^2(p(t)) \Big] \le 0, \end{aligned}$

by Jensen's inequality (recall that $\sum_\alpha p_{k\alpha} = P_k$). Since this inequality is strict
if all the $p_{k\alpha}$ are positive and $v_{k\alpha} \neq v_{k\beta}$ for some $\alpha \neq \beta$, this proves convergence
to equilibrium when the game has a unique equilibrium and no degeneracy.
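
For completeness, the Jensen step in (23) can be spelled out for a single user $k$.
Writing $w_\alpha = p_{k\alpha} / P_k$ for the normalized weights (a notation that we
introduce only for this remark, with $\sum_\alpha w_\alpha = 1$) and recalling that
$\sum_\alpha p_{k\alpha} v_{k\alpha} = P_k v_k$, we get:

```latex
\begin{aligned}
\sum_\alpha p_{k\alpha} v_{k\alpha} (v_{k\alpha} - v_k)
  &= \sum_\alpha p_{k\alpha} v_{k\alpha}^2 - v_k \sum_\alpha p_{k\alpha} v_{k\alpha} \\
  &= P_k \Big( \sum_\alpha w_\alpha v_{k\alpha}^2 - v_k^2 \Big)
   = P_k \sum_\alpha w_\alpha (v_{k\alpha} - v_k)^2 \ge 0.
\end{aligned}
```

In other words, the bracket in (23) is the weighted variance of user $k$'s marginal
payoffs, which vanishes exactly when all the $v_{k\alpha}$ with $p_{k\alpha} > 0$ coincide.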

To get the more general case (and, also, for independent interest), it is much
more instructive to consider as a Lyapunov candidate the relative entropy $H_q$ itself.
Indeed, a simple differentiation gives:

(24)  $\frac{dH_q}{dt} = -\sum_{k,\alpha} \frac{q_{k\alpha}}{p_{k\alpha}(t)} \frac{dp_{k\alpha}}{dt} = -\sum_{k,\alpha} q_{k\alpha} \big( v_{k\alpha}(p(t)) - v_k(p(t)) \big)$,

and, after rearranging the last term, we get:

(25)  $\frac{dH_q}{dt} = \sum_{k,\alpha} \big( p_{k\alpha}(t) - q_{k\alpha} \big) v_{k\alpha}(p(t)) = -L_q(p(t))$,

where

(26)  $L_q(p) = -\sum_{k,\alpha} (p_{k\alpha} - q_{k\alpha}) \, v_{k\alpha}(p)$.

We are thus left to show that $L_q(p(t)) \ge 0$ and, to that end, the key observation
is that $L_q$ may be interpreted as a directional derivative of $\Phi$. So, let us set
$f(\theta) = \Phi(q + \theta z)$, where $\theta \ge 0$ and $z$ is a vector in the (solid) tangent cone
$T_q\Delta$ of $\Delta$ at $q$:

(27)  $T_q\Delta = \{ z \in Z : z_{k\alpha} \ge 0 \text{ for all } \alpha \text{ with } q_{k\alpha} = 0 \}$,

i.e. $T_q\Delta$ consists of those tangent directions $z \in Z$ which point towards the interior
of $\Delta$ (recall that $q$ might lie on the boundary of $\Delta$). Clearly then, (26) may be
rewritten as:

(28)  $f'(\theta) = \sum_{k,\alpha} \left. \frac{\partial \Phi}{\partial p_{k\alpha}} \right|_{q + \theta z} z_{k\alpha} = \frac{1}{\theta} L_q(q + \theta z)$,

for all sufficiently small $\theta > 0$ such that $q + \theta z \in \Delta$.

However, since $\Phi$ is convex and $q$ is its unique minimum (a.s.), $f(\theta)$ will be
convex along any direction $z \in T_q\Delta$, so that $\theta f'(\theta) \ge f(\theta) - f(0)$. Hence, if
$p = q + \theta z$ is an arbitrary point of $\Delta$, equations (27) and (28) yield the growth
estimate:

(29)  $L_q(p) = \theta f'(\theta) \ge f(\theta) - f(0) = \Phi(p) - \Phi(q)$.

This last estimate shows that $L_q(p) > 0$ for all $p \neq q$, thus concluding our proof
of Theorem 8 for the non-degenerate case (note that then $\Phi(p) - \Phi(q) > 0$ for all
$p \neq q$).
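
The growth estimate (29) can be checked numerically. The Python sketch below is an
illustration under explicit assumptions: since (4) is not restated here, we take
the potential to be $\Phi(p) = -\sum_\alpha b_\alpha \log(\sigma_\alpha^2 + \sum_k g_{k\alpha} p_{k\alpha})$,
which is the form whose gradient is $-v_{k\alpha}$ as in (19); the game parameters and
function names are hypothetical. Note that the inequality $L_q(p) \ge \Phi(p) - \Phi(q)$
only uses convexity, so the check draws both $p$ and $q$ at random; the fact that $q$
is a minimum enters only through the sign of the right-hand side.

```python
import math
import random


def potential(p, g, b, sigma2):
    """Assumed potential: Phi(p) = -sum_a b[a] * log(sigma2[a] + sum_k g[k][a] * p[k][a])."""
    K, A = len(p), len(b)
    return -sum(
        b[a] * math.log(sigma2[a] + sum(g[k][a] * p[k][a] for k in range(K)))
        for a in range(A)
    )


def marginal_payoffs(p, g, b, sigma2):
    """v[k][a] = b[a] * g[k][a] / (sigma2[a] + sum_l g[l][a] * p[l][a]), cf. (19)."""
    K, A = len(p), len(b)
    load = [sigma2[a] + sum(g[l][a] * p[l][a] for l in range(K)) for a in range(A)]
    return [[b[a] * g[k][a] / load[a] for a in range(A)] for k in range(K)]


def L(q, p, g, b, sigma2):
    """L_q(p) = -sum_{k,a} (p[k][a] - q[k][a]) * v[k][a](p), cf. (26)."""
    v = marginal_payoffs(p, g, b, sigma2)
    return -sum(
        (p[k][a] - q[k][a]) * v[k][a]
        for k in range(len(p))
        for a in range(len(b))
    )


def random_profile(rng, P, A):
    """Random interior power profile with sum_a p[k][a] = P[k] for every user k."""
    profile = []
    for P_k in P:
        w = [rng.random() + 1e-9 for _ in range(A)]
        s = sum(w)
        profile.append([P_k * x / s for x in w])
    return profile


# Hypothetical 2-user, 3-node game.
g = [[1.0, 0.6, 2.0], [0.9, 1.1, 0.7]]
b = [1.0, 0.5, 2.0]
sigma2 = [0.1, 0.2, 0.3]
P = [1.0, 2.0]

rng = random.Random(0)
violations = 0
for _ in range(500):
    p = random_profile(rng, P, len(b))
    q = random_profile(rng, P, len(b))
    gap = potential(p, g, b, sigma2) - potential(q, g, b, sigma2)
    if L(q, p, g, b, sigma2) < gap - 1e-12:
        violations += 1
```

Under the assumed form of the potential, no violation should be reported, in line
with (29).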

To tackle the degenerate case, a semi-definite Lyapunov function (such as the
game's potential $\Phi$ or the relative entropy $H_q$) is not enough because it ensures
convergence to the set of minimum points and not to an actual point. Clearly, the
replicator dynamics in degenerate games might, in principle, exhibit phenomena of
this kind. However, there is much more at work in (20) than a single semi-definite
Lyapunov function: there exists a whole family of such functions, one for each
equilibrium $q$.¹¹

Figure 4. The sets in the proof of Theorem 8.

To take advantage of this, it will be useful to shift our point of view to the
evolution function $\Theta(p, t)$ of the dynamics (20) which describes the solution
trajectory that starts at $p$ at time $t = 0$ and which satisfies the consistency
condition:

(30)  $\Theta(p, t + s) = \Theta(\Theta(p, t), s)$ for all $t, s \ge 0$ and $p \in \Delta$.

So, fix some initial condition $p \in \mathrm{Int}(\Delta)$ (or, more generally, $p \in \Delta_q$ where $\Delta_q$ is
the domain of the relative entropy function $H_q$) and let $p(t) = \Theta(p, t)$ be the
corresponding solution orbit. If $q \in \Delta^*$ is Nash, then, in view of the above
discussion, the function $V_q(t) = H_q(\Theta(p, t))$ will be decreasing (though, perhaps,
not strictly so) and will converge to some $m \ge 0$ as $t \to \infty$. It thus follows that
$p(t)$ converges itself to the level set $H_q^{-1}(m)$.

Suppose now that there exists some increasing sequence of times $t_n \to \infty$ such
that $p_n = p(t_n)$ does not converge to the Nash set $\Delta^* = \Delta^*(\mathfrak{G})$. By compactness
of $\Delta$ (and by descending to a subsequence if necessary), we may assume that
$p_n = \Theta(p, t_n)$ converges to some $p^* \notin \Delta^*$ (but necessarily in $H_q^{-1}(m)$). Hence,
for any $t > 0$:

(31)  $H_q(\Theta(p, t_n + t)) = H_q(\Theta(\Theta(p, t_n), t)) \to H_q(\Theta(p^*, t)) < H_q(p^*) = m$,

where the (strict) inequality stems from the fact that $\dot H_q < 0$ outside $\Delta^*$. On the
other hand, $H_q(\Theta(p, t_n + t)) = V_q(t_n + t) \to m$, a contradiction.

Since the sequence $t_n$ was arbitrary, this shows that $p(t)$ converges to the set $\Delta^*$.
So, let $q'$ be a limit point of $p(t)$ with $p(t'_n) \to q'$ for some sequence of times
$t'_n \to \infty$. Then, $V_{q'}(t'_n) = H_{q'}(p(t'_n))$ will converge to zero and, with $V_{q'}$
decreasing, we will have $\lim_{t \to \infty} V_{q'}(t) = 0$ as well. Seeing as $H_{q'}$ only vanishes
at $q'$, we conclude that $p(t) \to q'$, i.e. every interior trajectory converges to
equilibrium.